Evolutionary trajectory of pattern recognition receptors in plants

Cell-surface receptors play pivotal roles in many biological processes, including immunity, development, and reproduction, across diverse organisms. How cell-surface receptors evolve to become specialised in different biological processes remains elusive. To shed light on the immune-specificity of cell-surface receptors, we analyzed more than 200,000 genes encoding cell-surface receptors from 350 genomes and traced the evolutionary origin of immune-specific leucine-rich repeat receptor-like proteins (LRR-RLPs) in plants. Surprisingly, we discovered that the motifs crucial for co-receptor interaction in LRR-RLPs are closely related to those of the LRR-receptor-like kinase (RLK) subgroup Xb, which perceives phytohormones and primarily governs growth and development. Functional characterisation further reveals that LRR-RLPs initiate immune responses through their juxtamembrane and transmembrane regions, while LRR-RLK-Xb members regulate development through their cytosolic kinase domains. Our data suggest that the cell-surface receptors involved in immunity and development share a common origin. After diversification, their ectodomains, juxtamembrane, transmembrane, and cytosolic regions have either diversified or stabilised to recognise diverse ligands and activate differential downstream responses. Our work reveals a mechanism by which plants evolve to perceive diverse signals to activate the appropriate responses in a rapidly changing environment.

mental and reproductive processes 11,12 (Supplementary Fig. 1). Given the striking resemblance in their domain architecture, it is reasonable to infer that immunity- and development-related cell-surface receptors share a common origin. However, the evolutionary trajectory that led to their divergence and specialisation in distinct biological processes remains poorly understood.

Results
The origin and expansion of cell-surface receptors in the plant lineage
Plant cell-surface receptors that are known to participate in immunity, development, and reproductive processes include the LRR-, G-lectin-, Wall-associated kinase (WAK)-, Domain of Unknown Function 26 (Duf26)-, L-lectin-, Lysin motif (LysM)-, and Malectin-containing RLKs and RLPs (Fig. 1a-h). There are additional RLK families with different ectodomains, such as the proline-rich extensin-like receptor kinases (PERKs) and thaumatin-like protein kinases (TLPKs) 9,13. However, their function in immunity is not well characterized. Cell-surface receptors with LRR-, G-lectin-, WAK-, and LysM-ectodomains have been reported to recognise PAMPs, while others perceive self-molecules or unidentified ligands (Fig. 1h; Supplementary Fig. 1). Recognition of this diverse array of ligands is likely accomplished by variable structures and combinations of different ectodomains (Fig. 1a-g). To trace the origins of different receptor classes within the plant lineage, we first identified RLKs and RLPs in 350 genomes from Glaucophyta, red algae, green algae, Bryophytes, and Tracheophytes. Here we define RLKs as proteins with 1-2 transmembrane regions (TMs) and a kinase domain (KD), and RLPs as proteins with 1-2 TMs that lack a KD. In total, we identified 177,645 RLKs, almost 70% of which possess LRR-, G-lectin-, WAK-, Duf26-, L-lectin-, LysM-, or Malectin-ectodomains (Fig. 1i). Next, we searched for proteins with these ectodomains and TMs that lack KDs and found 41,144 RLPs (Fig.
1j). We further examined which of the identified RLK and RLP families are likely to be involved in immunity. A previous report suggested a positive correlation between the gene family sizes of cell-surface immune receptors and intracellular immune receptors (the NB-ARC family) across the angiosperms 4. We examined the correlation between the relative size (%; number of identified genes in the family / number of searched genes × 100; see methods) of the RLK families, the RLP families, and the NB-ARC family in each genome. Notably, most RLK families (except for the LysM-RLKs) exhibit positive correlations with the NB-ARC family, while most RLP families (except for the LRR-RLPs) do not (Fig. 1k). Furthermore, we checked the expression level of these receptor families in Arabidopsis thaliana during immunity. The RLKs, except for LRR- and Malectin-RLKs, generally exhibit higher expression levels than the RLPs during immunity (Fig. 1k; Supplementary Fig. 2). These data collectively suggest that the RLKs are more likely to be involved in immunity than the RLPs.
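The relative-size and correlation calculation described above can be sketched as follows. This is a minimal illustration and not the authors' pipeline: the function names and the example counts are hypothetical, and the correlation classes simply reuse the thresholds quoted in the Fig. 1k legend (strong, Pearson's r > 0.6; medium, r between 0.3 and 0.6).

```python
from math import sqrt


def relative_size(n_family, n_searched):
    """Relative family size in percent: identified genes / searched genes x 100."""
    if n_searched <= 0:
        raise ValueError("n_searched must be positive")
    return n_family / n_searched * 100.0


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / sqrt(sxx * syy)


def correlation_class(r):
    """Bin r using the thresholds from the figure legend (an assumption of this sketch)."""
    if r > 0.6:
        return "strong positive"
    if r >= 0.3:
        return "medium positive"
    return "no positive correlation"


# Hypothetical per-genome counts: (receptor family genes, NB-ARC genes, searched genes).
genomes = [(120, 90, 30000), (300, 260, 42000), (45, 20, 25000), (210, 150, 38000)]
family_pct = [relative_size(fam, total) for fam, _, total in genomes]
nbarc_pct = [relative_size(nb, total) for _, nb, total in genomes]
r = pearson_r(family_pct, nbarc_pct)
label = correlation_class(r)
```

Normalising by the number of searched genes before correlating keeps large genomes from dominating the comparison, which is presumably why percentages rather than raw counts are used.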
We also examined the expansion patterns of different receptor classes across various plant lineages. We calculated the median percentage (%) of each cell-surface receptor family in (i) Glaucophyta and Rhodophyta, (ii) green algae, and (iii) Bryophytes to determine the percentage increase (% increase; see methods and Supplementary Note 2) from Glaucophyta and Rhodophyta to green algae; from green algae to Embryophytes; and between Bryophytes and Tracheophytes. We observed substantial expansions in specific receptor families across these lineages. Green algae exhibited a significant expansion of LRR-RLKs, while Embryophytes displayed expansions in LRR-RLPs, LRR-RLKs, WAK-RLKs, and G-lectin-RLKs. Tracheophytes had further expansions in LRR-RLPs, WAK-RLKs, Malectin-RLKs, G-lectin-RLKs, and Duf26-RLPs (Fig. 2; Supplementary Fig. 3). Overall, RLKs show greater expansion than RLPs, with notable expansions observed in LRR-RLKs, WAK-RLKs, and G-lectin-RLKs. In addition, the LRR-RLP family has also significantly expanded throughout the plant lineage. These findings align with the substantial size of these receptor families and their involvement in recognising pathogens (Fig. 1i; Supplementary Fig. 1b). LRR-RLKs are classified into 20 subgroups based on their kinase domains, with subgroup XII specifically implicated in PAMP recognition 15. In particular, the LRR-RLK-XII subgroup exhibits a considerably higher expansion rate compared to other subgroups (Supplementary Fig. 5), reinforcing the idea that cell-surface immune receptors underwent extensive expansions as the plant lineage diversified and evolved to adapt to a wide range of environments.

Structures of FLS2, CERK1, and FERONIA were published 55,95,96. Structures of LORE, DORN1, WAK1 and CRK28 were predicted by AlphaFold2 97. Ectodomains are visualized in iCn3D 98. h Schematic displays the domain architecture of different classes of receptor-like kinases (RLKs) and receptor-like proteins (RLPs) in plants. Arrows represent the ligands that these receptor classes have been reported to perceive or recognize. The upper box defines the ligands recognized by different receptors. The lower box defines the domains within the receptor classes. Note that these receptors may be able to recognise other unidentified ligands. For more information, see Supplementary Fig. 1. i Ectodomain distribution of RLKs in plants. Each fraction represents the percentage (%) of an ectodomain out of all the RLKs from 350 species (177,645). j Ectodomain distribution of RLPs in plants. Each fraction represents the percentage (%) of an ectodomain out of all the RLPs with those seven ectodomains (41,144). k Table of RLKs and RLPs with LRR (red), G-lectin (orange), WAK (turquoise), Duf26 (blue), L-lectin (purple), LysM (green), and Malectin (magenta) ectodomains. Characterised receptors involved in microbial interaction (bacteria icon), reproduction (flower icon), and development (leaf icon) are indicated with light green boxes. Grey boxes indicate that the receptor class has not been reported to be involved in that biological process. For details, refer to Supplementary Fig. 1. Correlations between different classes of cell-surface receptors and NB-ARC in 300 angiosperms are indicated with bars. Strong positive correlations are indicated by extension to the light green area (Pearson's r > 0.6) and medium positive correlations are within the yellow area (Pearson's r between 0.3 and 0.6). Expression level refers to the expression of each class of cell-surface receptors during NLR-triggered immunity (NTI) in Arabidopsis thaliana. Light blue area represents increased expression and light pink area represents decreased expression during NTI. X-axis values represent log2(fold change during ETI relative to untreated samples). Boxplot elements: center line, median; bounds of box, 25th and 75th percentiles; whiskers, 1.5 × IQR from 25th and 75th percentiles. Number of cell-surface receptors (n) analysed in the RNA-seq data: LRR-RLK, n = 159; LRR-RLP, n = 42; G-lectin-RLK, n = 29; G-lectin-RLP, n = 1; WAK-RLK, n = 18; WAK-RLP, n = 10; Duf26-RLK, n = 33; Duf26-RLP, n = 7; L-lectin-RLK, n = 21; L-lectin-RLP, n = 4; LysM-RLK, n = 5; LysM-RLP, n = 2; Malectin-RLK, n = 13; Malectin-RLP, n = 4. RNA-seq data analysed here were reported previously, where NTI was activated by estradiol-induced expression of AvrRps4 in A. thaliana for 4 h 94. For the expression of each class of cell-surface receptors during PTI in A. thaliana, refer to Supplementary Fig. 2.

proximity, initiating a cascade of auto- and trans-phosphorylation events 16. The activated receptor complex subsequently phosphorylates members of the cytoplasmic receptor-like kinases subgroup VII (RLCK-VII) 17, which, in turn, phosphorylate various cytoplasmic kinases, such as the mitogen-activated protein kinase kinase kinases (MAPKKKs) and calcium-dependent protein kinases (CDPKs), and plasma membrane-associated proteins, such as cyclic nucleotide-gated channels (CNGCs), hyperosmolality-gated calcium-permeable channels (OSCAs), and NADPH oxidases (RBOHs) 16. The phosphorylation of these proteins collectively triggers transcriptional reprogramming and physiological changes, such as cytoplasmic calcium influx and the accumulation of reactive oxygen species (ROS) 18. These physiological responses effectively hinder pathogen proliferation during infection (Fig. 2a; Supplementary Figs. 6 and 7). We identified cell-surface co-receptors and signalling components from the 350 genomes and determined their absence or presence across the plant lineage (Supplementary Fig. 7). SERKs, acting as cell-surface co-receptors for multiple LRR-RLKs and LRR-RLPs, are present in Zygnematophyceae and Embryophytes 19 (Fig. 3b; Supplementary Fig.
8a, b), suggesting their emergence during or prior to the appearance of land plants. Immune-related LRR-RLPs lack intracellular kinase domains and thus require another LRR-RLK co-receptor, SOBIR1, to activate downstream signalling 20. Similar to BAK1, SOBIR1 is also present in Embryophytes (Fig. 3b). Thus, co-receptors for cell-surface receptors likely evolved during or before the emergence of land plants. On the other hand, cytoplasmic kinases (RLCKs, CDPKs, MAPKKKs, MAPKKs, and MAPKs) are ancient, as are the PM-localised downstream signalling components (CNGCs, OSCAs, and RBOHs), which are found across all plant lineages (Fig. 3b; Supplementary Fig. 8). Although the exact function of these proteins in algal species remains unclear, their immune-related orthologs are present in green algae (Fig. 3b; Supplementary Fig. 8). This suggests that they underwent specialisation within the immune activation pathway prior to the emergence of land plants. The EP proteins (EDS1, PAD4, and SAG101) and RPW8-NLRs (NRG1 and ADR1), which are essential for both TIR-NLR- and LRR-RLP-mediated immunity 21,22, are only present in gymnosperms and angiosperms (seed plants) 23. Considering the ancient nature of the LRR-RLPs, it is plausible that EP proteins and helper NLRs were integrated into the LRR-RLP signalling pathway, forming a robust immune network in seed plants.
Our investigation of the expansion rate of signalling components within the plant lineage indicated an expansion of CDPKs in green algae and an expansion of RLCK-VIIs in Tracheophytes (Fig. 3b; Supplementary Fig. 8; Supplementary Note 2). However, other families of signalling components exhibit more limited expansions compared to cell-surface receptors. This is also consistent with the considerably larger family sizes of cell-surface receptors and NLRs in comparison to the signalling components (Supplementary Fig. 9). Furthermore, we examined the correlation between the percentages of signalling components and PRRs (LRR-RLK-XIIs + LRR-RLPs) across genomes. Except for CNGCs, EP proteins, and RPW8-NLRs (0.6 > Pearson's r > 0.3), most signalling component families do not exhibit co-expansion or co-contraction with PRRs (Supplementary Fig. 9). Thus, we concluded that plants are more likely to evolve new receptors than new downstream signalling components for adaptation. The RLCK-VIIs are further classified into ten subgroups, which are differentially required for RLKs and RLPs to activate downstream responses 17,24-27 (Supplementary Figs. 6 and 7). Similarly, CDPKs fall into four subgroups (Supplementary Fig. 8). RLCK-VII and CDPK subgroup members are differentially required by different PRRs to activate downstream responses 28-31. Pathogens often target RLCK-VIIs through secreted effectors to suppress immunity 32-34. Thus, redundancy among RLCK-VII subgroups serves as a protective mechanism for the downstream signalling pathway against effector targeting. In addition, plants have evolved RLCK-VII pseudokinases, or 'decoys', to guard functional RLCKs through NLRs 34-38. Together, these observations suggest that the expansion of RLCK-VII families may have been driven by pathogen pressure, thereby contributing to the enhanced robustness of the immune signalling network.
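The expansion rate used in these comparisons is defined in the figure legend as ((% in a species) - (median)) / (median). A minimal sketch of that calculation is given below. It assumes, as one reading of the legend, that the median is taken over the ancestral (reference) lineage and that the rate is then computed for each species of the derived lineage. The function name and the example percentages are hypothetical.

```python
from statistics import median


def expansion_rates(reference_pcts, derived_pcts):
    """Expansion rate of each derived-lineage species relative to the reference median.

    reference_pcts: per-genome family percentages in the reference lineage
    derived_pcts:   per-genome family percentages in the derived lineage
    Returns None when the reference median is zero, because the ratio is
    undefined if the family is absent from the reference lineage.
    """
    ref_median = median(reference_pcts)
    if ref_median == 0:
        return None
    return [(pct - ref_median) / ref_median for pct in derived_pcts]


# Hypothetical percentages for one family in a reference and a derived lineage.
reference = [1.0, 2.0, 3.0]
derived = [4.0, 2.0]
rates = expansion_rates(reference, derived)
```

Using the median rather than the mean limits the influence of outlier genomes, as the legend notes. A family that is absent from the reference lineage has no defined rate under this formula, which this sketch reports as None.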
To investigate the functional necessity of IDs, we analysed the ectodomains of LRR-RLPs and LRR-RLKs from 350 species (113,794) (Fig. 4a and Supplementary Fig. 12a, b). Employing multiple prediction programs, we identified gaps between LRR motifs of either 10-29 or 30-90 amino acids (AA) (Supplementary Fig. 12a). Since NLs typically span 6-30 AA and IDs range from around 40-75 AA, we treated small gaps (10-29 AA) as corresponding to NLs and large gaps (30-90 AA) as indicative of IDs (Supplementary Fig. 10). Small and large gaps are relatively infrequent in LRR-RLKs (10.6% and 5.43%, respectively) (Fig. 4a). In contrast, both are more prevalent in LRR-RLPs (28.3% and 61.6%, respectively). Furthermore, both LRR-RLKs and LRR-RLPs typically have only one gap, which can be either small or large (Supplementary Fig. 12c, d). Our analysis also showed that the positions of small gaps within the ectodomains of both LRR-RLKs and LRR-RLPs are not fixed and may be distributed randomly. Conversely, large gaps are predominantly positioned before the last four LRR motifs in the ectodomain (51.2% for LRR-RLKs and 86.9% for LRR-RLPs) (Fig. 4b, c and Supplementary Fig. 13). Thus, our findings suggest a functional requirement for IDs to be positioned before the last four LRRs.
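The gap classification described above can be illustrated with a short sketch. It assumes that an upstream predictor has already produced the coordinates of each LRR motif; that input format, the function name, and the example coordinates are all hypothetical. Only the 10-29 AA and 30-90 AA size classes come from the text.

```python
def classify_gaps(lrr_spans):
    """Classify gaps between consecutive LRR motifs in one ectodomain.

    lrr_spans: list of (start, end) residue indices per motif, end exclusive.
    Returns one record per small (10-29 AA) or large (30-90 AA) gap, with the
    number of LRR motifs that follow the gap.
    """
    spans = sorted(lrr_spans)
    records = []
    for i in range(len(spans) - 1):
        gap_len = spans[i + 1][0] - spans[i][1]
        if 10 <= gap_len <= 29:
            kind = "small"
        elif 30 <= gap_len <= 90:
            kind = "large"
        else:
            continue  # contiguous motifs, or a gap outside both size classes
        records.append(
            {"length": gap_len, "kind": kind, "lrrs_after": len(spans) - (i + 1)}
        )
    return records


# Hypothetical ectodomain: three 24-residue motifs, a 45-residue gap, four more motifs.
example_spans = [
    (0, 24), (24, 48), (48, 72),
    (117, 141), (141, 165), (165, 189), (189, 213),
]
gaps = classify_gaps(example_spans)
```

In this example the single large gap is followed by four motifs, which corresponds to the "ID before the last four LRRs" arrangement discussed in the text.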
The top panel is a sequence similarity tree of multiple algal and plant lineages. Circles (○) and stars (☆) indicate the origins and expansion of receptor families, respectively. The timescale (in million years; MYA) of the sequence similarity tree was estimated by TIMETREE5 99. The bottom panel displays the presence or absence of receptor classes in different algal and plant lineages. *M/C/K represents Mesostigmatophyceae, Chlorokybophyceae, and Klebsormidiophyceae. The number of available species from each algal and plant lineage is indicated within the respective boxes. A grey box indicates the absence, while a green box indicates the presence, of a given protein family in each lineage. Dark green indicates the presence of orthologs of immunity-related (PTI) signalling components within that protein family (see also Supplementary Fig. 8). The origin of a protein family is indicated with a circle (○), followed by another circle indicating the origin of the orthologs of the PTI-signalling component. Expansion rates of PTI-signalling component families are indicated by boxplots. The percentages (%) of signalling components from each genome were calculated as (number of identified genes/number of searched genes × 100). Next, the percentages from each species within a lineage (e.g., Rhodophyta or green algae) were grouped and the median percentage was calculated. The median was used instead of the mean to limit the influence of outliers within the lineages. The expansion rate within a species is calculated as ((% signalling components in that species) - (median))/(median). The cyan boxplot represents the expansion rate from Glaucophyta and Rhodophyta to green algae (SERKs, n = 0; SOBIR1, n = 0; RLCKs (VII), n = 0; CDPKs, n
Lineage labels (number of species): Rhodophyta (5), Chlorophyta (13), M/C/K* (3), Charophyta (1), Zygnematophyceae (3), Bryophyta (5), Anthocerotophyta (3), Lycophytes (1), Polypodiophyta (3), Gymnosperms (10

To further validate the functional conservation of C3 regions in LRR-RLK-Xb and LRR-RLPs, we performed functional analysis of the QxxT/S motif in the LRR-RLP RLP23 from A. thaliana. RLP23 forms heteromeric complexes with the LRR-RLK co-receptor SOBIR1, and upon perception of the nlp20 peptide, BAK1 is recruited into the complex, leading to activation of the SOBIR1 KD to induce immunity 20,49,56. Similar to RXEG1 (with TQLQT), RLP23 possesses a TQITG motif (TQxxx), while LRR-RLK-Xb members, such as PSY1R, PSKR1, PSKR2, and BRI1, feature TQFDT, GQFQT, GQFYS, and GQFET motifs, respectively (all QxxT/S). Notably, the LRR-RLK-XII member EFR lacks the QxxT/S motif in that position, having GVFRN instead (Fig. 5d, e). We generated chimeric constructs of RLP23 in which the terminal LRR motif was swapped for that of PSY1R, PSKR2, BRI1, or EFR (Fig. 5e, f). Using immunoprecipitation assays, we tested the ability of these chimeras to interact with BAK1 upon ligand perception. Both wild-type RLP23 (WT) and the RLP23-PSY1R (PY) chimera can interact with BAK1 upon nlp20 treatment, whereas RLP23-BRI1 (BR) and RLP23-EFR (EFR) cannot (Fig. 5g). This suggests that the terminal LRR motifs of RLP23 and PSY1R bind BAK1 in a similar manner. All chimeras can interact with SOBIR1 regardless of the presence of nlp20, indicating that the last LRR motif may not be involved in SOBIR1 interactions (Fig. 5g). Furthermore, upon nlp20 treatment, both WT and PY can trigger immune responses, while K2 and BR cannot (Fig. 5h, i). We speculate that this may be due to the absence of a specific T residue before the Qxxx motif in PSKR2 and BRI1 (G instead of T), which is relatively less prevalent within the LRR-RLP clade (31.1%). Multiple A. thaliana LRR-RLPs contain this residue, but it is absent from other studied LRR-RLPs, such as the tomato Cf proteins, suggesting that this residue evolved in some species after the divergence of LRR-RLK-Xb and LRR-RLPs. To test this hypothesis, we generated chimeric constructs of RLP23 with the terminal LRR motif swapped for that of PSKR1 (K1), PSKR2 (K2), or BRI1 (BR), and mutated the G residue to T (yielding K1 T, K2 T, and BR T; Fig. 5e, f). We tested the immune responses triggered by these chimeras following nlp20 treatment. K1, K1 T, and K2 T induce relatively weak immune responses compared to WT and PY, while K2, BR, and BR T are unable to induce any immune responses (Fig. 5j, k). Both K1 and K1 T weakly interact with BAK1 upon nlp20 treatment, while K2 does not interact with BAK1 (Fig. 5k). Thus, the residues around and within the TQxxx motif are both crucial for SERK interactions. We concluded that the C3 regions in LRR-RLPs and some LRR-RLK-Xbs (such as PSY1R and PSKR1) interact with SERKs in a similar manner, while some LRR-RLK-Xbs (PSKR2 and BRI1) have evolved to interact with SERKs in a slightly different manner. Nevertheless, our results strongly support the functional conservation of C3 regions in LRR-RLK-Xbs and LRR-RLPs, specifically their ability to interact with SERKs.
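The motif comparison above can be reproduced on the quoted five-residue segments with two regular expressions. This is only an illustrative sketch: the patterns are a literal reading of "QxxT/S" and "TQxxx" with x as any residue, and the function names are hypothetical. The segments themselves are the ones given in the text.

```python
import re

# Literal readings of the motif names, with "x" meaning any residue.
QXXTS = re.compile(r"Q..[TS]")  # QxxT/S
TQXXX = re.compile(r"TQ...")    # TQxxx: a T immediately before the Q


def has_qxxts(segment):
    """True if the segment contains a QxxT/S motif."""
    return QXXTS.search(segment) is not None


def has_tqxxx(segment):
    """True if the segment contains a TQxxx motif."""
    return TQXXX.search(segment) is not None


# Terminal-LRR segments quoted in the text.
segments = {
    "RXEG1": "TQLQT",
    "RLP23": "TQITG",
    "PSY1R": "TQFDT",
    "PSKR1": "GQFQT",
    "PSKR2": "GQFYS",
    "BRI1": "GQFET",
    "EFR": "GVFRN",
}
summary = {
    name: {"QxxT/S": has_qxxts(seg), "TQxxx": has_tqxxx(seg)}
    for name, seg in segments.items()
}
```

Under this literal reading, the RLP23 segment TQITG matches TQxxx but not QxxT/S, while PSKR1, PSKR2, and BRI1 match QxxT/S but lack the preceding T, which is the residue the G-to-T chimeras were designed to restore.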

On the origin of LRR-RLPs and LRR-RLK-Xbs
The functional conservation of C3 regions in LRR-RLK-Xbs and LRR-RLPs implies that the ectodomains of these two receptor families might share a common origin. To dissect the ectodomain origin of LRR-RLPs and LRR-RLK-Xbs, we investigated the relatedness of the IDs between these two groups of receptors. First, we aligned the IDs extracted from both LRR-RLKs and LRR-RLPs (20,246). Remarkably, the ID clusters of LRR-RLPs and LRR-RLK-Xb are found in close proximity, mirroring the C3 sequence similarity tree (Fig. 6a). Again, the BRI1/BRI1-LIKE (BRL) and PSKR/PSY1R clusters are in proximity to the LRR-RLPs (Fig. 6a, b). These results are also consistent with a previous report that PSKRs are closely related to some LRR-RLPs in Arabidopsis and rice 46. A recent review identified two conserved lysine (K)-containing motifs, Yx8KG and Kx5Y, in the ID of LRR-RLPs 40. The lysine residue in the Kx5Y motif of RXEG1 is required for its interaction with BAK1 53. We identified both Yx8KG and Kx5Y motifs in the extracted IDs (before the last four LRR motifs) of LRR-RLKs and LRR-RLPs. More than 75% of IDs from LRR-RLPs contain at least one of these lysine motifs, while less than 5% of IDs from LRR-RLK-Xb have either motif (Fig. 6c). This is consistent with structural data indicating that IDs from BRI1 and PSKR1 employ distinct residues for their interactions with BAK1 48,50,52. However, IDs from LRR-RLPs that are closely related to those from LRR-RLK-Xbs retain Kx5Y motifs (Fig. 5b). It is therefore possible that the common ID ancestor originally harboured lysine motifs that were subsequently lost from the BRI1 and PSKR/PSY1R clades.
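A search for the two lysine motifs can be sketched with regular expressions. This is an illustration only, not the authors' method: it assumes that "x8" and "x5" mean exactly eight and five arbitrary residues, the function names are hypothetical, and the example sequences are synthetic placeholders rather than real IDs.

```python
import re

YX8KG = re.compile(r"Y.{8}KG")  # Yx8KG: Y, eight arbitrary residues, then K and G
KX5Y = re.compile(r"K.{5}Y")    # Kx5Y: K, five arbitrary residues, then Y


def lysine_motifs(id_seq):
    """Report which of the two lysine motifs occur in an ID sequence."""
    return {
        "Yx8KG": YX8KG.search(id_seq) is not None,
        "Kx5Y": KX5Y.search(id_seq) is not None,
    }


def fraction_with_motif(id_seqs):
    """Fraction of ID sequences carrying at least one of the two motifs."""
    if not id_seqs:
        raise ValueError("no sequences given")
    hits = sum(1 for seq in id_seqs if any(lysine_motifs(seq).values()))
    return hits / len(id_seqs)


# Synthetic placeholder sequences (not real IDs), built to exercise each case.
ids = [
    "SS" + "Y" + "A" * 8 + "KG" + "SS",  # carries Yx8KG only
    "SS" + "K" + "A" * 5 + "Y" + "SS",   # carries Kx5Y only
    "SSAAAASS",                          # carries neither motif
]
frac = fraction_with_motif(ids)
```

Applied per receptor family, a fraction of this kind corresponds to the "more than 75%" and "less than 5%" figures reported above.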
To further explore the relatedness of the IDs between the LRR-RLK-Xb group and LRR-RLPs, we formed clusters of highly similar IDs and examined their subgroup/family affiliations. Overall, we identified more than 2,822 clusters, with the majority (2,734) consisting of LRR-receptors of a single subgroup/family (Supplementary Fig. 15). Among the 61 clusters containing LRR-receptors from two different subgroups/families, 80% (49 clusters) consist of LRR-receptors from LRR-RLK-Xb and LRR-RLPs (Supplementary Fig. 15). The enrichment of LRR-RLK-Xb and LRR-RLP pairings provides further evidence for the relatedness of their ectodomains. The substantial number of LRR-RLP-only ID clusters (2,383) also suggests that the IDs of LRR-RLPs have undergone extensive evolution and diversification, in the process providing a broad scope for the recognition of PAMPs and apoplastic effectors. Most clusters contain a relatively small number of IDs, predominantly from species within the same order or family (Supplementary Data 2). Consequently, tracing back the origin of IDs is challenging due to their considerable diversity. We therefore propose that the IDs of LRR-RLK-Xbs and LRR-RLPs likely originated from a common ancestor, with the IDs of LRR-RLPs expanding and diversifying after the divergence of LRR-RLK-Xb and LRR-RLPs.
Within the BRI1/BRI1-LIKE (BRL) and PSKR/PSY1R clusters, we found LRR-RLP counterparts of BRI1, PSKR, and PSY1R. To further test their relatedness, we aligned the full ectodomains from LRR-RLKs and LRR-RLPs and extracted the clades containing BRI1, PSKR, and PSY1R from the sequence similarity tree (Supplementary Fig. 16a, b). Within the BRI1/BRL- and PSKR-ectodomain clades (Fig. 6e, h), we found ectodomains from LRR-RLPs that are highly similar to BRL1/BRL3 and PSKR2, with residues essential for BR and PSK binding, respectively 48,54 (Fig. 6d-l). Within the PSY1R-ectodomain subclade, we found multiple IDs from LRR-RLPs that share remarkable similarity with AtPSY1R (Supplementary Fig. 17b, c). AtRLP3 and AtPSY1R have over 70% sequence identity and 85% similarity in their ectodomains. Although PSY1R is not the receptor for PSY peptide 57, it is possible that RLP3 and PSY1R recognise similar or identical ligands 58. Currently, the functions of these BRL-, PSKR- and PSY1R-like LRR-RLPs remain unclear. AtRLP3 confers resistance against the vascular wilt fungus Fusarium oxysporum f. sp. matthioli 58, while AtPSY1R is involved in growth and development 43. We propose that these RLPs may either recognise endogenous molecules to activate growth and development, or participate in the recognition of pathogen-mimicking molecules to trigger immune signalling.

Specialisation of cell-surface receptors in different biological processes
Given the common origin of the ectodomains of LRR-RLPs and LRR-RLK-Xbs, these receptors must have undergone specialisation in immune and developmental processes following their divergence. While LRR-RLK-Xbs activate downstream responses through the Xb kinase domain, LRR-RLPs recruit SOBIR1 to trigger immunity 49,56. Because LRR-RLPs lack a kinase domain, the juxtamembrane (JM) and TM regions may activate immune responses. We therefore aligned the C3, eJM (JM region before the TM), TM, and cJM (JM region after the TM, excluding the kinase domain of RLKs) regions from LRR-RLKs and LRR-RLPs (with a subset of 40, or the full set of 62,896 cell-surface receptors; Fig. 7a-c). Interestingly, the eJM-TM-cJM region of LRR-RLPs is not closely related to that of LRR-RLK-Xbs (Fig. 7c). Previous studies have reported the requirement of Glycine-X-X-X-Glycine (GxxxG) motifs in SOBIR1 for its association with LRR-RLPs, and the potential contribution of negatively charged amino acids in the eJM region of LRR-RLPs to their interaction with SOBIR1 59. Consistent with these reports 59, LRR-RLPs are strongly negatively charged at the end of the eJM region, whereas LRR-RLKs, including SOBIR1, are positively charged at the end of the eJM region (Fig. 7a). Most LRR-RLPs have a single GxxxG motif, with some having two (GxxxGxxxG) or three (GxxxGxxxGxxxG) consecutive motifs (Fig. 7a). Conversely, GxxxG motifs are relatively less common in LRR-RLKs, but can be found in SOBIR1, BAK1, and the PSKR/PSY1R clade (Fig. 7a). We further examined overall charges in the eJM region of the LRR-RLP and LRR-RLK families. Most LRR-RLK subgroups feature positively charged eJM regions, whereas LRR-RLPs have negatively charged eJM regions. Importantly, negatively charged eJM regions are only present in LRR-RLPs within the ID + 4LRR clade (Fig. 7d). More than 80% of LRR-RLPs within the ID + 4LRR clade contain GxxxG motifs in their TMs. Moreover, GxxxGxxxG motifs are primarily found in LRR-RLPs and are relatively rare in LRR-RLKs (Fig. 4d). Interestingly, both GxxxG and GxxxGxxxG motifs are relatively enriched in the PSKR/PSY1R clade, indicating that the TM regions of LRR-RLPs and the PSKR/PSY1R clade might also share a common origin (Fig. 7d).
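The two sequence features compared above, GxxxG motifs in the TM region and the overall charge of the eJM region, can be approximated with a few lines of code. This is a rough sketch under stated assumptions: charge is estimated by counting K/R as +1 and D/E as -1 (histidine and pH effects are ignored), the function names are hypothetical, and the example sequences are synthetic.

```python
import re

# A lookahead lets consecutive motifs that share a glycine (GxxxGxxxG) each be counted.
GXXXG = re.compile(r"(?=G...G)")


def count_gxxxg(tm_seq):
    """Number of GxxxG motifs in a TM sequence, counting shared-glycine repeats."""
    return len(GXXXG.findall(tm_seq))


def net_charge(seq):
    """Crude net charge: +1 per K or R, -1 per D or E (simplifying assumption)."""
    positive = sum(seq.count(residue) for residue in "KR")
    negative = sum(seq.count(residue) for residue in "DE")
    return positive - negative


# Synthetic examples (not real receptor sequences).
tm_double = "AGAAAGAAAGL"  # GxxxGxxxG: two motifs sharing the middle glycine
ejm_acidic = "SDDEEK"      # net negative, as described for LRR-RLP eJM regions
ejm_basic = "SKRRL"        # net positive, as described for LRR-RLK eJM regions

n_motifs = count_gxxxg(tm_double)
charges = (net_charge(ejm_acidic), net_charge(ejm_basic))
```

Counting with a lookahead rather than a plain pattern matters here because the text distinguishes single, double, and triple consecutive motifs, and a non-overlapping search would report GxxxGxxxG as one motif.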
To assess the functionality of eJM-TM-cytosolic regions in LRR-RLK-Xbs and LRR-RLPs, we generated multiple chimeras of the eJM, TM, and cytosolic regions of BRI1 and RLP23 (Fig. 8a). In RLP23 c-BRI1, the cytosolic region (following the TM) of BRI1 was swapped into RLP23. In RLP23 TM-BRI1, the TM + cytosolic region of BRI1 was swapped into RLP23. In RLP23 eJM-BRI1, the eJM + TM + cytosolic region of BRI1 was swapped into RLP23 (Fig. 8a). Immunoprecipitation assays showed that both RLP23 WT and RLP23 c-BRI1 exhibit constitutive interactions with SOBIR1, and specifically interact with BAK1 upon nlp20 treatment. RLP23 TM-BRI1 does not interact with SOBIR1 but still interacts with BAK1 upon nlp20 treatment, suggesting that the RLP23 TM region with GxxxG motifs is necessary for SOBIR1 interactions, but not for BAK1 interactions (Fig. 8b). Consistent with these immunoprecipitation assays, RLP23 WT and RLP23 c-BRI1 can activate immune responses, while RLP23 c-BRI1 and RLP23 TM-BRI1 can activate developmental responses, as indicated by the dephosphorylation of BRI1-EMS-SUPPRESSOR 1 (BES1) 60 (Fig. 8c; Supplementary Fig. 18). This confirms that the BRI1 kinase domain is specifically required for the activation of BR responses. Interestingly, RLP23 TM-BRI1 can also activate weak immune responses, likely independent of SOBIR1 (Fig. 8c, d). RLP23 eJM-BRI1 does not interact with SOBIR1 or BAK1, and is unable to activate either immune or developmental responses (Fig. 8c, d). These data suggest that the eJM region of RLP23 is necessary for RLP23-BAK1 interactions. This conclusion was reinforced by generating the RLP23 eo-BRI1 chimera, in which the eJM region of RLP23 is replaced with the BRI1 eJM (Fig. 8a). RLP23 eo-BRI1 consistently accumulates less protein than RLP23 WT. Moreover, RLP23 eo-BRI1 constitutively interacts with SOBIR1, but fails to interact with BAK1 upon nlp20 treatment (Fig. 8e). RLP23 eo-BRI1 only weakly activates MAPKs and fails to trigger ROS production upon nlp20 treatment (Fig. 8f, g). Thus, we conclude that the eJM regions of LRR-RLPs are important for protein accumulation and for interaction with BAK1 upon ligand perception. The eJM, TM, and cytosolic regions of LRR-RLPs and LRR-RLK-Xbs have indeed specialized to recruit different proteins into the receptor complex, which allows them to activate differential downstream responses (Fig. 8h).

Discussion
The ectodomains of LRR-RLPs and LRR-RLK-Xbs appear to share a common evolutionary origin, suggesting an ancestral adoption of the ID + 4LRR architecture for ligand recognition and interactions with co-receptors such as the SERKs. Since both LRR-RLPs and LRR-RLK-Xbs rely on the terminal 4LRR/C3 region for SERK interaction 48,50,52,53, it is plausible that the C3 region has undergone stabilising evolution to preserve this functionality (Fig. 8i). Consequently, a portion of the C3 region in LRR-RLK-Xb and LRR-RLP can be interchanged without loss of function. However, the remaining ectodomain, including the LRR motifs preceding the ID and the ID itself, has undergone diversification and adaptation to recognise distinct ligands (Fig. 8i). For example, the perception of a herbivore-associated peptide (inceptin) by the inceptin receptor (INR) emerged specifically in legume species around 28 million years ago 61. As a result of this rapid diversification, we were unable to trace back to the common ancestor of LRR-RLK-Xb and LRR-RLP. Notably, certain LRR-RLPs exhibit significant sequence similarities to LRR-RLK-Xbs (BRI1/BRL, PSKR, and PSY1R), though their specific functions remain to be explored. Following diversification, the eJM, TM, and cytosolic regions of LRR-RLK-Xb and LRR-RLP acquired distinct roles, primarily in development and immunity, respectively (Fig. 8i). The eJM and TM regions of LRR-RLP specifically facilitate constitutive interactions with SOBIR1 and ligand-dependent interactions with BAK1, whereas the LRR-RLK-Xb group lacks such specialisation. Since the recruitment of SOBIR1 to BRI1 would lead to immune activation, there should be negative selection against negatively charged eJMs and GxxxGs in the TM regions to prevent immune activation by LRR-RLK-Xbs. Given the different roles of various domains and regions of LRR-RLKs and LRR-RLPs, we propose a model in which different domains or regions of cell-surface receptors undergo modular evolution to either diversify or maintain their original functions. Modular evolution allows the specialisation of cell-surface receptors to recognise different ligands and to activate distinct downstream signal responses while maintaining interactions with co-receptors (Fig. 8i). This model is consistent with the observation that LRR-RLKs have undergone substantial structural evolution to generate novel receptors 14.

Hydrogen bonds are indicated by green dotted lines, and salt bridges are shown as cyan dotted lines. The positions of LRR residues (counting from N to C for SERKs and counting from C to N for LRR-RLKs and LRR-RLP) are shown. Amino acid residues that are important for the interactions are labelled and the QxxT motifs are highlighted in yellow (red text). The right panel represents the 2D interaction network between SERKs and the receptors. Contacts/interactions are shown in grey lines, hydrogen bonds are shown in green lines, and salt bridges are shown in cyan lines. Amino acids are labelled in colours according to their positions in the LRR motifs (counting from N to C for SERKs and counting from C to N for LRR-RLKs and LRR-RLP). Residues around and within the QxxT motifs in BRI1, PSKR1, and RXEG1 are highlighted in yellow. Residues in SERKs that are involved in the interactions with QxxT motifs are also highlighted in yellow. Structures were visualized in iCn3D 98. For a-c, the interaction sites are calculated by iCn3D with the following thresholds: hydrogen bonds: 4.2 Å; salt bridges/ionic bonds: 6 Å; contacts/interactions: 4 Å. d Structure of the terminal LRR motif of N. benthamiana (Nb)RXEG1, A. thaliana (At)RLP23, AtPSY1R, AtPSKR1, AtPSKR2, AtBRI1 and AtEFR. Structures of NbRXEG1, AtPSKR1, and AtBRI1 were published 48,50-54. Structures of AtRLP23, AtPSY1R, AtPSKR2 and AtEFR were predicted by AlphaFold2 97. Ectodomains are visualised in iCn3D 98. e Alignment of amino acids in the last LRR motifs from NbRXEG1, AtRLP23, AtPSY1R, AtPSKR1, AtPSKR1 T (G > T), AtPSKR2, AtPSKR2 T (G > T), AtBRI1, AtBRI1 T (G > T), and AtEFR. Amino acid residues involved in the interaction between NbRXEG1 and BAK1 are highlighted in green. The QxxT motif positions are highlighted in yellow. Amino acids with similar properties to AtRLP23 are highlighted in grey. f Design of AtRLP23 chimeras. The last LRR motif of AtRLP23 is exchanged with that of either AtPSY1R, AtPSKR1, AtPSKR2, AtBRI1, or AtEFR. The glycine (G) residues in AtPSKR1, AtPSKR2, and AtBRI1 have also been mutated to threonine (T). g, l Immunoprecipitation to test interactions between AtRLP23 chimeras, AtBAK1 and AtSOBIR1. Nb leaves expressing the indicated constructs were treated with either mock or 1 μM nlp20 for 5 min. h-k Functionality testing of AtRLP23 chimeras. Nb leaves expressing the indicated constructs were treated with 1 μM nlp20 and samples were collected at the indicated time points. Phosphorylation of NbSIPK and NbWIPK was detected with p-P42/44 antibody. i, k Nb leaf discs expressing the indicated constructs were collected and treated with either mock or 1 μM nlp20, and ROS production was measured for the indicated time points. For i and k, solid line, mean; shaded band, s.e.m. RLU, relative light units. For details of the experimental design in g-l, refer to the methods section. For g, h, j and l, the experiments were repeated at least twice with similar results.
The presence of diverse cell-surface receptor classes and multiple downstream signalling components in algal species suggests that algae may have a pathogen-sensing system (Fig. 5a), as indicated by PAMPs triggering defence responses in some algal species 62 (Supplementary Note 2). Most cell-surface receptors and downstream signalling components in the Tracheophytes are conserved in Bryophytes (Fig. 9a). Therefore, the most recent common ancestor of land plants is likely to have possessed a considerable number of cell-surface receptors and the basic components of a signalling network. We also observed a significant expansion of cell-surface receptor families in land plants, which likely facilitated the adaptation to terrestrial environments (Fig. 9a). Our work has uncovered multiple evolutionary mechanisms of cell-surface receptors that facilitate plant adaptation: (i) Expansion of the number of receptors and their recognition specificity. The expansion of cell-surface receptor subgroups, including LRR-RLKs, LRR-RLPs, G-lectin-RLKs, and WAK-RLKs, has enabled plants to recognise a broader range of molecules specific to certain stresses and environmental signals. (ii) Development of an increasingly complex signalling network. Cell-surface co-receptors, such as SERKs and SOBIR1, emerged during or around the time of land plant evolution. Moreover, several cell-surface receptors involved in signalling regulation, such as malectin-RLKs (FERONIA) 63 and LRR-RLK-Xa (BIR1, BIR2) 64-67, are found exclusively in Embryophytes. Cell-surface receptors utilise different co-receptors and their cytosolic kinase domains to differentially activate downstream signalling components, including RLCK-VIIs, CDPKs, and MAPKs, which fine-tune the magnitude and specificity of downstream responses 17,68. Collectively, it seems that increasingly intricate and specialised signalling networks enhance the flexibility and regulation of differential responses to keep up with the rapidly changing environment 69. (iii) Adaptation of existing receptors for specific signalling (Fig. 9b). The structural similarities between LRR-RLPs and LRR-RLK-Xbs imply a common origin of immune-specific and development-specific cell-surface receptors. The exact nature of the common ancestral form of these receptors, whether an RLK or an RLP, remains, and perhaps will always remain, uncertain. Both LRR-RLK-Xbs and LRR-RLPs with ID + 4LRR can be found in land plants, so it is conceivable that LRR-RLK-Xb, with its kinase domain predating land plant emergence, evolved from an integration of an LRR-RLP containing an ID + LRR into an Xb kinase domain 70. In this scenario, the common ancestral receptors could have recognised a PAMP, with the peptide sequence of this PAMP possibly converted to serve as a phytocytokine to regulate plant developmental processes. Multiple phytocytokines, such as PSK, PSY, SCOOPs, and CLE peptides, are present in plant pathogens and pests 71-75. Whether the perception of phytocytokines evolved from the perception of PAMPs, or pathogens developed phytocytokine mimics to repress immune responses, remains an open question.

RLK, RLP, and ectodomain identification
We used the same sequences described in the previous publication 4. We also used the LRR-RLK, LRR-RLP, LysM-RLK, and NB-ARC sequences described in that publication 4 and did not search for them again.
The initial set of proteins included only the primary gene models from all 350 species (12,979,225 proteins in total). Prior to any further HMM searches, sequences were filtered for a minimal length of 250 AA in the case of LRR-RLKs (7,690,505 proteins) or 150 AA in the case of all others (10,224,242 proteins). Based on the presence of a kinase domain (KD) and/or a trans-membrane domain (TM), the proteins fell into three major groups: (1) RLKs with KD and TM, (2) RLPs without KD but with TM, and (3) ectodomain candidates without KD or TM.
To identify RLK candidates, we first searched for the presence of a protein kinase domain (PFAM: PF00069.26) with hmmer (version 3.1b2 76, option -E 1e-10; 439,075 proteins found). If multiple hits were found, only the best match was kept. Potential signal peptides were removed with SignalP (version 5.0b 77) to avoid identifying and including signal peptides as TMs (in 139,628 out of the 439,075 candidates). TMs were searched with tmhmm (version 2.0 78) and only candidates with 1-2 TMs were kept, leaving 177,645 proteins. The locations of the KD and the TM were used to split the protein sequence into the endodomain (with KD) and ectodomain (without KD) 4,79.
To identify RLPs and ectodomain candidates (without KD or TM), we first removed all proteins with a kinase domain match (hmmer with the option -E1000), leaving 9,746,585 of the initial 10,224,242 proteins. We next removed potential signal peptides from the sequences with SignalP because they are sometimes identified as TM domains (796,385 sequences trimmed). We then searched for TMs with tmhmm and kept proteins with no TM (7,917,087 ectodomain candidates) or 1-2 TMs (962,223 RLPs).
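The grouping rules above can be summarised in a short sketch. This is not the pipeline used in the study: the `Protein` record, its field names, and the `classify` function are hypothetical, and the sketch assumes that kinase-domain hits (hmmer) and TM counts (tmhmm, after signal-peptide removal with SignalP) have already been computed. It also collapses the two E-value thresholds used for the RLK and RLP searches into a single boolean.

```python
from dataclasses import dataclass


@dataclass
class Protein:
    """Hypothetical summary of one primary gene model after domain searches."""
    name: str
    length: int        # protein length in amino acids
    has_kinase: bool   # kinase-domain HMM hit (e.g. PFAM PF00069)
    n_tm: int          # predicted TM helices after signal-peptide removal


def classify(p: Protein, min_length: int = 150) -> str:
    """Return 'RLK', 'RLP', 'ectodomain' or 'excluded' for one protein."""
    if p.length < min_length:
        return "excluded"            # fails the minimal-length filter
    if p.has_kinase:
        # RLKs need a kinase domain and one or two TMs
        return "RLK" if 1 <= p.n_tm <= 2 else "excluded"
    if p.n_tm == 0:
        return "ectodomain"          # no KD and no TM
    if 1 <= p.n_tm <= 2:
        return "RLP"                 # TM present, KD absent
    return "excluded"                # more than two TMs


if __name__ == "__main__":
    examples = [
        Protein("kinase_with_tm", 900, True, 1),
        Protein("tm_only", 700, False, 2),
        Protein("secreted", 400, False, 0),
    ]
    for protein in examples:
        print(protein.name, classify(protein))
```

For LRR-RLK candidates, `min_length=250` would correspond to the stricter length filter stated above.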
To test whether RLPs contained potentially functional endodomain sequences, we extracted endo- and ectodomain sequences.

Fig. 6 (continued) | d, g Contacts/interactions are shown in grey lines, hydrogen bonds are shown in green lines, and salt bridges are shown in cyan lines. Amino acids are labelled in colours according to their positions in the LRR motifs (counting from C to N). Structures were visualized in iCn3D 98. e, h Sequence similarity trees of the full ectodomains of e the BRI1/BRL clade and h the PSKR clade from 350 species. Branches are labelled in colours as indicated in a. These trees are extracted from the BRI1/BRL and PSKR branches from Supplementary Fig. 16. In e, h, characterized LRR-RLK-Xb members are labelled. The LRR-RLK-Xb and LRR-RLP members taken for the alignment in f and i are also labelled in blue numbers (LRR-RLK-Xb) and pink numbers (LRR-RLP), respectively. f, i Ectodomain alignment of multiple LRR-RLK-Xb and LRR-RLP members within the BRI1/BRL clade extracted from e and the PSKR clade extracted from h. f The alignment of ectodomains from LRR-RLK-Xb (blue) and LRR-RLP (pink) members taken from the sequence similarity tree in e. The orange highlights indicate the amino acid residues required for BR binding in AtBRI1, and the yellow highlights indicate the amino acid residues required for BR binding in AtBRL1 100. The yellow highlights correspond to the amino acids highlighted in the structure in d. The LRR motifs and ID in the alignment are indicated in the colours shown in d (the interaction network; right panel). i The alignment of ectodomains from LRR-RLK-Xb (blue) and LRR-RLP (pink) members taken from the sequence similarity tree in h. The green highlights indicate the amino acid residues required for PSK binding in AtPSKR1 48. The green highlights correspond to the amino acids highlighted in the structure in g. The LRR motifs and ID in the alignment are indicated in the colours shown in g (the interaction network; right panel). Due to space limitation, the last 9 amino acids in each LRR motif are presented as * in the alignment. For the full alignment, refer to Supplementary Fig. 17a.
RPW8-NLRs (NRG1 and ADR1) were identified similarly using the NB-ARC (PFAM: PF00931.23) and RPW8 (PFAM: PF05659.12) domains. After clustering as described above, we found all known NRG1 and ADR1 sequences in one single cluster. This allowed us to extract and recluster these sequences with more stringent parameters (options --min-seq-id 0.3 -c 0.75). After that, we found all known NRG1 (AT5G66900, AT5G66910, and Niben101Scf02118g00018.1) and ADR1 (AT1G33560, AT4G33300, AT5G04720, and Niben101Scf02422g02015) proteins in exactly one cluster each, indicating that the two matching clusters could be considered as NRG1 and ADR1 proteins.

Expansion rate of cell-surface receptors and signalling components
The percentages (%) of cell-surface receptors and signalling components from each genome were calculated as (number of identified genes/number of searched genes × 100). Next, the percentages from each species within a lineage (e.g., Rhodophyta or green algae) were grouped and the median percentage was calculated. The median was used instead of the mean to reduce the influence of outliers within the lineages. The expansion rate within a species is calculated as ((% cell-surface receptors or signalling components in that species) - (median))/(median). For example, the expansion rate of the LRR-RLP family in Marchantia polymorpha relative to green algae is calculated as ((%LRR-RLP in Marchantia polymorpha) - (median %LRR-RLP in green algae))/(median %LRR-RLP in green algae). Values larger than 0 indicate expansion, values equal to 0 indicate no expansion, and values below 0 indicate contraction. Note that the reliability of the expansion rate depends on the number of species used to calculate the median, which in turn depends on the available genomes in Glaucophyta, red algae (Rhodophyta), green algae, and Bryophytes.
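As a minimal sketch of the calculation described above, the expansion rate can be written as two small functions. The function names and the example numbers are illustrative assumptions and are not taken from the data set; the sketch also makes explicit that the rate is undefined when the reference median is zero.

```python
from statistics import median


def percentage(n_identified: int, n_searched: int) -> float:
    """Percentage of searched genes that belong to a given family."""
    return n_identified / n_searched * 100


def expansion_rate(species_pct: float, reference_pcts: list[float]) -> float:
    """(species % - median reference %) / median reference %."""
    ref = median(reference_pcts)
    if ref == 0:
        # The ratio cannot be computed if the family is absent
        # from at least half of the reference lineage.
        raise ValueError("reference median is zero; expansion rate undefined")
    return (species_pct - ref) / ref


if __name__ == "__main__":
    # Illustrative percentages for three reference genomes and one species
    reference = [1.0, 2.0, 4.0]
    print(expansion_rate(6.0, reference))   # positive: expansion
    print(expansion_rate(1.0, reference))   # negative: contraction
```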

Identification of N-loopouts (NLs) and island domains (IDs)
To identify N-loopouts (NLs) or island domains (IDs) in LRR-RLK and LRR-RLP proteins, we used a dataset of previously described LRR-RLKs and LRR-RLPs 4. We searched the LRR-RLKs again for kinase domains (PFAM: PF00069.26) with hmmer (option -E 1e-10), and kept only the best match for each protein. LRR motifs and transmembrane domains were searched in both groups with predict-phytolrr 79 and tmhmm 78, respectively. LRR-RLKs were filtered for the presence of an internal KD, one or two TMs, and at least two external LRR repeats ('internal' was defined as the side with the kinase domain). LRR-RLPs were filtered for the absence of a KD and the presence of one or two TMs and at least two external LRR motifs ('external' was defined as the side with more LRR repeats). The outer LRR motifs were then used to identify NLs and IDs: individual repeats were grouped into LRR-regions if they were less than 13 AA apart from each other. Gaps between LRR-regions or LRR motifs that were 15-29 AA or 30-90 AA long were extracted as NL and ID candidates, respectively. After extracting gap sequences, all sequences were again checked for the presence of LRR motifs using predict-phytolrr, hmmer with all LRR patterns as described previously 4, and LRRsearch 82. Only NLs and IDs without any LRR match were included in the final dataset.
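The grouping and gap-extraction rule can be illustrated with a small sketch. It is not the custom script used in the study: the function names are hypothetical, motif coordinates are assumed to be 1-based inclusive (start, end) pairs on the ectodomain, and the gap length is assumed to be the number of residues strictly between two neighbouring regions.

```python
def group_into_regions(motifs, max_gap=13):
    """Merge (start, end) LRR motifs separated by fewer than max_gap residues."""
    regions = []
    for start, end in sorted(motifs):
        if regions and start - regions[-1][1] - 1 < max_gap:
            # Close enough to the previous motif: extend the current region
            regions[-1][1] = max(regions[-1][1], end)
        else:
            regions.append([start, end])
    return [tuple(r) for r in regions]


def classify_gaps(regions):
    """Label gaps between consecutive regions: NL (15-29 AA) or ID (30-90 AA)."""
    candidates = []
    for (_, prev_end), (next_start, _) in zip(regions, regions[1:]):
        length = next_start - prev_end - 1
        if 15 <= length <= 29:
            label = "NL"
        elif 30 <= length <= 90:
            label = "ID"
        else:
            continue  # gap is outside both size windows
        candidates.append((prev_end + 1, next_start - 1, label))
    return candidates


if __name__ == "__main__":
    # Three contiguous repeats, a 70-residue gap, then four contiguous repeats
    motifs = [(1, 24), (25, 48), (49, 72),
              (143, 166), (167, 190), (191, 214), (215, 238)]
    regions = group_into_regions(motifs)
    print(regions)
    print(classify_gaps(regions))
```

Candidates returned by such a step would still need the second LRR check described above before being accepted as NLs or IDs.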
Locations of the NLs and IDs in relation to LRR motifs and LRR-regions were determined with a custom R script. ID sequences were aligned to each other with FAMSA 83 without trimming 84 and sequence similarity trees were inferred with FastTree 85 (version 2.1.11 SSE3, option -wag). Trees were rooted with gotree 86 (v0.4.2) using one sequence belonging to the most basal species as outgroup, according to the taxonomic tree. The sequence similarity tree of the IDs was used to cluster the proteins: the tree was converted into a distance matrix using the function cophenetic.phylo() from the R-package 'ape' 87 (version 5.6-2). Distances smaller than 0.2 (i.e. less than 0.2 substitutions per site on average) were extracted, converted to similarities, and used as edges in a network. Communities within this similarity network were identified with the function cluster_louvain 88 implemented in the R-package 'igraph' 89 (version 1.2.6).
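The step from a distance matrix to a similarity network can be sketched as follows. Two points are assumptions rather than the study's method: the conversion of a distance d to a similarity is taken to be 1 - d, which the text does not specify, and connected components are used here as a crude stand-in for Louvain community detection, which was performed with igraph in R. The function names are hypothetical.

```python
from itertools import combinations


def similarity_edges(names, dist, cutoff=0.2):
    """Return (a, b, similarity) for every pair with distance below cutoff."""
    edges = []
    for i, j in combinations(range(len(names)), 2):
        d = dist[i][j]
        if d < cutoff:
            edges.append((names[i], names[j], 1.0 - d))  # assumed conversion
    return edges


def connected_components(names, edges):
    """Group nodes linked by any edge (simplified stand-in for Louvain)."""
    parent = {n: n for n in names}

    def find(x):
        # Follow parent links to the root, shortening the path on the way
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b, _ in edges:
        parent[find(a)] = find(b)
    groups = {}
    for n in names:
        groups.setdefault(find(n), set()).add(n)
    return sorted(groups.values(), key=lambda g: sorted(g))


if __name__ == "__main__":
    names = ["id1", "id2", "id3", "id4"]
    dist = [
        [0.00, 0.05, 0.50, 0.60],
        [0.05, 0.00, 0.45, 0.70],
        [0.50, 0.45, 0.00, 0.10],
        [0.60, 0.70, 0.10, 0.00],
    ]
    edges = similarity_edges(names, dist)
    print(edges)
    print(connected_components(names, edges))
```

Unlike connected components, Louvain clustering can split a densely connected network into several communities, so the two approaches agree only when clusters are well separated.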

In-depth phylogeny of the ectodomains and other regions from LRR-RLPs and LRR-RLKs
In-depth analysis of the ectodomains from LRR-RLPs and LRR-RLKs was done using the LRR-RLKs and LRR-RLPs from the NLs and IDs search (see above). We first searched for the C3 domain in each sequence 4,46 with hmmer and selected the best hit. We then pruned the sequences to include everything from the C3 domain to the C terminus. For the LRR-RLKs, we further searched sequences for kinase domains and removed sequences upstream of the start of the kinase domain. That is, for all LRR-RLKs and LRR-RLPs, we extracted regions with C3, eJM, TM, or cJM domains. These sequences were aligned with FAMSA. Specific domains (e.g. C3 or eJM) were subsequently extracted from this alignment. After extraction, the sequence similarity trees of specific domains were constructed as described above (FAMSA, FastTree, gotree).

In-depth phylogeny of the ectodomains from all non-LRR candidates
In-depth analysis of the ectodomains from all Duf26, G-type lectin, L-type lectin, malectin/malectin-like, LysM, and WAK candidates was done using the ectodomain sequences extracted from the ectodomain-only proteins, RLKs, and RLPs. Ectodomain sequences from the ectodomain-only proteins were extracted based on the location of the HMM pattern match. Phylogenies were constructed as described above with FAMSA, FastTree, and gotree (rooted with a sequence from the most basal species according to the NCBI taxonomy).

Taxonomic trees
Taxonomic trees used in this study were identical to the ones described previously 4: the taxonomic tree for visualising the entire data set and selecting outgroups was obtained from NCBI (https://www.ncbi.nlm.nih.gov/Taxonomy/CommonTree/wwwcmt.cgi). The tree used for testing the relationship between the fraction of candidates found and phylogenetic distances was obtained from a previous report 90. The latter contained 238 out of the 351 genomes analysed. Sequence similarity trees were visualised and pruned, and figures were generated with iTOL 91.

Test for similarities in fraction of proteins and phylogenetic relationships
Tests for similarities in the fraction of proteins and phylogenetic relationship were done as described previously 4: to test whether the fraction of certain proteins found per species correlated with predicted phylogenetic relationships, we converted the fractions and the sequence similarity tree to distance matrices and tested for correlation with Mantel tests (R-package vegan, version 2.5-7, with 10,000 permutations). Analogously, we also tested for correlation between distance matrices obtained for two different sets of proteins. P-values were corrected for multiple testing to reflect false discovery rates (FDRs 92).
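For readers unfamiliar with the Mantel test, the idea can be sketched in plain Python. This is not the vegan implementation used in the study. It assumes a Pearson correlation between the upper triangles of two symmetric distance matrices and a one-sided permutation p-value obtained by shuffling the rows and columns of one matrix together. The function names and the seed parameter are hypothetical.

```python
import random
from math import sqrt


def _upper(matrix, order):
    """Upper-triangle values of a symmetric matrix under a given item order."""
    n = len(order)
    return [matrix[order[i]][order[j]]
            for i in range(n) for j in range(i + 1, n)]


def _pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def mantel(d1, d2, permutations=10_000, seed=0):
    """Return (observed r, one-sided permutation p-value)."""
    identity = list(range(len(d1)))
    x = _upper(d1, identity)
    observed = _pearson(x, _upper(d2, identity))
    rng = random.Random(seed)
    hits = 0
    for _ in range(permutations):
        perm = identity[:]
        rng.shuffle(perm)  # relabel the items of the second matrix
        if _pearson(x, _upper(d2, perm)) >= observed:
            hits += 1
    # Count the observed arrangement itself as one permutation
    return observed, (hits + 1) / (permutations + 1)


if __name__ == "__main__":
    d1 = [[0, 1, 2, 3], [1, 0, 1, 2], [2, 1, 0, 1], [3, 2, 1, 0]]
    d2 = [[2 * v for v in row] for row in d1]
    r, p = mantel(d1, d2, permutations=999)
    print(r, p)
```

Permuting item labels rather than individual matrix cells keeps the dependence structure of a distance matrix intact, which is why an ordinary correlation test is not used here.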

Vector construction
The CDS regions of AtRLP23, AtBRI1, AtPSKR2, AtPSY1R, AtEFR, and AtBES1 were amplified by PCR with KOD One (Toyobo, Japan), and the PCR products were cloned into the epiGreenB5 (3× HA) vector between the ClaI and BamHI restriction sites with the In-Fusion HD Cloning Kit (Clontech, USA) to generate p35S::BES1-HA or p35S::cell-surface receptor-HA (epiGreenB5-Cauliflower mosaic virus (CaMV) p35S:gene of interest-3× HA). The constructs were then transformed into Agrobacterium tumefaciens strain AGL1 for transient expression in Nicotiana benthamiana. All chimeric cell-surface receptors generated in this study contain the EFR signal peptide to ensure consistency between the constructs and expression levels.

Transient expression in Nicotiana benthamiana
A. tumefaciens strain AGL1 carrying the binary expression vectors described above was grown on LB agar plates amended with selection antibiotics. Cells were pelleted by centrifugation and then resuspended in infiltration buffer (10 mM MgCl2, 10 mM MES at pH 5.6, and 100 μM acetosyringone). The concentration of AGL1 was then adjusted to OD600 = 0.5 and the suspension was syringe-infiltrated into N. benthamiana leaves.

Protein extraction and immunoprecipitation
Protein extraction for immunoprecipitation was performed as previously described 93. Three days after transient expression, three to four grams of N. benthamiana leaves were treated with elicitors and snap-frozen. The tissues were then ground in liquid nitrogen and extracted in extraction buffer (50 mM Tris-HCl at pH 7.5, 150 mM NaCl, 10% glycerol, 5 mM DTT, 2.5 mM NaF, 1 mM Na2MoO4·2H2O, 0.5% polyvinylpyrrolidone (w/v), 1% Protease Inhibitor Cocktail (P9599; Sigma-Aldrich), 100 μM phenylmethylsulphonyl fluoride, 2% IGEPAL CA-630 (v/v; Sigma-Aldrich), and 2 mM EDTA) at a concentration of 3 mL/g tissue powder. Samples were then incubated at 4 °C for an hour and debris was removed by centrifugation at 13,000 rpm for 10 min at 4 °C. Supernatants were collected, protein concentrations were adjusted to 5 mg/mL, and the samples were then incubated with rotation for an hour at 4 °C with 50 μL anti-HA magnetic beads (Miltenyi Biotec) for immunoprecipitation. Magnetic beads were then washed twice with extraction buffer and the HA-tagged protein was eluted with sodium dodecyl sulphate (SDS) sample buffer at 95 °C.

Immunoblotting
Protein extractions were performed as previously described 94. N. benthamiana leaves were infiltrated with elicitors and snap-frozen at [...] Technology, USA) in a solution of 5% BSA in TBST overnight at 4 °C. HA-tagged BES1 or cell-surface receptors were detected using Anti-HA-Peroxidase, High Affinity, rat IgG1 antibody (Roche) in a solution of 5% skimmed milk in TBST overnight at 4 °C. For detection of MAPKs, this was followed by incubation with α-rabbit IgG-HRP-conjugated secondary antibodies (1:10,000, Roche, USA) in a solution of 5% BSA in TBST for an hour at room temperature. HRP signal was then detected by Clarity Western ECL Substrate (Bio-Rad) with a LAS 4000 system (GE Healthcare, USA). Nitrocellulose membranes were stained with Coomassie Brilliant Blue (CBB) to ensure equal loading.

ROS assays
ROS burst assays were performed as described previously 93. N. benthamiana leaf discs were collected with a 4-mm-diameter cork borer and placed in 96-well plates with 120 μL deionised water overnight in the dark (abaxial surface of the leaves facing down). N. benthamiana leaf discs were then treated with either mock (water) or 1 μM nlp20 in 20 mM luminol (Wako, Japan) and 0.02 mg/mL horseradish peroxidase (Sigma-Aldrich). Luminescence was then measured over the indicated periods of time with a Tristar2 multimode reader (Berthold Technologies, Germany).

Fig. 8 | Adaptation of LRR ID + 4LRR to differential downstream signalling pathways. a Design of AtRLP23-BRI1 chimeras. Different regions (cytosolic, TM + cytosolic, and eJM + TM + cytosolic) of BRI1 were swapped into AtRLP23 as indicated. The alignment of amino acids in the eJM and TM regions from AtRLP23 and AtBRI1 is shown. Amino acid residues with negative charges are in red and amino acids with positive charges are in blue. The GxxxG motif in the TM region is highlighted in yellow. b, e Immuno-precipitation to test interactions between AtRLP23 chimeras, AtBAK1 and AtSOBIR1. Nb leaves expressing the indicated constructs were treated with either mock or 1 μM nlp20 for 5 min. c, d, f, g Functionality testing of AtRLP23 chimeras. Nb leaves expressing the indicated constructs were treated with 1 μM nlp20 and samples were collected at indicated time points. Dephosphorylation of BES1-HA was detected with HA antibody. Phosphorylation of NbSIPK and NbWIPK was detected with p-P42/44 antibody. For f, twice the sample of RLP23 oe-BRI1 was loaded as a reference, because RLP23 oe-BRI1 protein accumulation is weaker than that of RLP23 WT. d, g Nb leaf discs expressing the indicated constructs (without BES1-HA) were collected and treated with either mock or 1 μM nlp20 and ROS production was measured for 90 min. For d and g, solid line, mean; shaded band, s.e.m. RLU, relative light units. For details of experimental design, refer to the methods section. h Schematic model of the interaction between LRR-RLK-Xb ID + 4LRR and LRR-RLP ID + 4LRR with co-receptors to induce differential downstream signalling. Both receptor classes utilise the last 4 LRRs (highlighted in yellow) to interact with SERKs (BAK1). LRR-RLP evolved to interact with SOBIR1 with the GxxxG motifs in TM (highlighted in yellow outline). Coloured hexagons on RLKs indicate activated kinases and a black hexagon indicates an inactivated kinase. i Modular evolution of different domains in cell-surface receptors to allow diverse ligand recognition and specificity of downstream signalling. Domains or regions that evolved different functions are highlighted in yellow. Bold arrows represent large expansions and diversifications. K* represents the lysine in Kx5Y or Yx8KG motifs in ID from LRR-RLPs. Domain or region structures (from left to right) are obtained from: BRI1 ectodomain (3RGX); RXEG1 ectodomain (7W3X); PSKR1 ID (4Z63); RXEG1 ID (7W3X); PSKR1 C3 (4Z63); RXEG1 C3 (7W3X); PSKR1 eJM-TM (predicted from Alphafold2 20); RLP23 eJM-TM (predicted from Alphafold2 20); BRI1 kinase (4OH4). Structures were visualized in iCn3D 98. For b, c, e and f, the experiments were repeated at least twice with similar results.

Reporting summary
Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.

Fig. 3 | The origin and evolution of cell-surface receptor signalling components in plants. a Schematic figure of the simplified PTI signalling pathway in plants. Coloured hexagons on RLKs indicate activated kinases. For details, refer to Supplementary Fig. 6. b The top panel is a sequence similarity tree of multiple algal and plant lineages. Circles (○) and stars (☆) indicate the origins and expansion of receptor families, respectively. The timescale (in million years; MYA) of the sequence similarity tree was estimated by TIMETREE5 99. The bottom panel displays the presence or absence of receptor classes in different algal and plant lineages. *M/C/K represents Mesostigmatophyceae, Chlorokybophyceae, and Klebsormidiophyceae. The number of available species from each algal and plant lineage is indicated within the respective boxes. A grey box indicates the absence, while a green box indicates the presence of a given protein family in each lineage. Dark green indicates the presence of orthologs of immunity-related (PTI) signalling components within that protein family (see also Supplementary Fig. 8). The origin of a protein family is indicated with a circle (○), followed by another circle indicating the origin of the orthologs of PTI-signalling components. Expansion rates of PTI-signalling component families are indicated by boxplots. The percentages (%) of signalling components from each genome were calculated as (number of identified genes/number of searched genes × 100). Next, the percentages from each species within a lineage (e.g., Rhodophyta or green algae) were grouped and the [...] = 20; MAPKKKs, n = 20; MAPKKs, n = 20; MAPKs, n = 20; CNGCs, n = 19; OSCAs, n = 19; RBOHs, n = 10; EP proteins, n = 0; RPW8-NLRs, n = 0). The yellow boxplot represents the expansion rate from green algae to Embryophytes (SERKs, n = 316; SOBIR1, n = 0; RLCKs (VII), n = 324; CDPKs, n = 324; MAPKKKs, n = 324; MAPKKs, n = 324; MAPKs, n = 324; CNGCs, n = 324; OSCAs, n = 322; RBOHs, n = 324; EP proteins, n = 0; RPW8-NLRs, n = 0) and the orange boxplot represents the differences between early land plants and Tracheophytes (SERKs, n = 307; SOBIR1, n = 309; RLCKs (VII), n = 314; CDPKs, n = 314; MAPKKKs, n = 314; MAPKKs, n = 314; MAPKs, n = 314; CNGCs, n = 314; OSCAs, n = 312; RBOHs, n = 314; EP proteins, n = 0; RPW8-NLRs, n = 0). Light blue area represents expansion and light pink area represents contraction of the gene family. X-axis values represent expansion rate (×). Values larger than 0 indicate expansion, values equal to 0 indicate no expansion, and values below 0 indicate contraction. Boxplot elements: centre line, median; bounds of box, 25th and 75th percentiles; whiskers, 1.5 × IQR from 25th and 75th percentiles. For details, refer to the methods.

Fig. 4 | The origin and evolution of LRR ID + 4LRR in plants. a Distribution of LRR-RLKs and LRR-RLPs in LRR-containing cell-surface receptors (LRR-TM) from 350 plant species, and the distribution of LRR-RLPs and LRR-RLKs with or without gaps of 10-29 amino acids (10-29) or 30-90 amino acids (30-90). b, c Position of large gaps (IDs; 30-90 AA) in b LRR-RLKs and c LRR-RLPs with a single large ID. N1 represents the number of LRR motifs before the ID and N2 represents the number of LRR motifs after the ID. For positions of gaps in LRR-RLKs and LRR-RLPs with multiple gaps, refer to Supplementary Fig. 13. d The concentric ring pie chart presents the percentage of LRR-containing cell-surface receptors (PRRs) from 350 species. The inner ring represents all LRR-containing cell-surface receptors (113,794); the middle ring represents LRR-containing PRRs with ID (20,556); the outer ring represents LRR-containing PRRs with an ID preceding the last 4 LRRs (ID + 4LRR) at the C terminus (16,885). e The presence or absence of receptor classes in various algal and plant lineages. *M/C/K represents Mesostigmatophyceae, Chlorokybophyceae, and Klebsormidiophyceae. A grey box indicates the absence, and a green box indicates the presence of a given receptor class in each lineage. The origin of LRR-RLP and LRR-RLK-Xb with ID + 4LRR is indicated with a circle (○). f Sequence similarity tree of the C3 region (last 4 LRRs) from all LRR-containing cell-surface receptors of 350 species. Branches are colour-labelled as indicated. The inner ring and middle ring indicate the lineage and subclass/order of the corresponding protein (species) from the branch. The outer ring represents the LRR-RLP or LRR-RLK classification, which is indicated in d. The light grey area indicates clustering of LRR-RLK-Xb and LRR-RLP with ID + 4LRR. The pruned sequence similarity tree on the right (g) corresponds to the light grey area in the left tree, with clades labelled in dark grey areas accordingly. Characterized LRR-RLK-Xb and LRR-RLP members are labelled. The BRI/BRL clade and the PSKR/PSY1R clades are also labelled.

Fig. 5 | Functional characterization of the C3 region in LRR-RLPs and LRR-RLK-Xb. a-c Structures and interaction interfaces of LRR-RLKs and LRR-RLPs with SERKs. Published structures of a NbRXEG1-NbBAK1 53, b AtPSKR1-AtSERK1 48, and c AtBRI1-AtBAK1 50 are shown. The left panels show the full structure, and the middle panels show the interaction sites between LRR-RLKs or LRR-RLP and SERKs. Hydrogen bonds are indicated by green dotted lines, and salt bridges are shown as cyan dotted lines. The positions of LRR residues (counting from N to C for SERKs and counting from C to N for LRR-RLKs and LRR-RLP) are shown. Amino acid residues that are important for the interactions are labelled and the QxxT motifs are highlighted in yellow (red text). The right panel represents the 2D interaction network between SERKs and the receptors. Contacts/interactions are shown in grey lines, hydrogen bonds are shown in green lines, and salt bridges are shown in cyan lines. Amino acids are labelled in colours according to their positions in the LRR motifs (counting from N to C for SERKs and counting from C to N for LRR-RLKs and LRR-RLP). Residues around and within the QxxT motifs in BRI1, PSKR1, and RXEG1 are highlighted in yellow. Residues in SERKs that are involved in the interactions with QxxT motifs are also highlighted in yellow. Structures were visualized in iCn3D 98. For a-c, the interaction sites are calculated by iCn3D with the following thresholds: hydrogen bonds: 4.2 Å; salt bridges/ionic bonds: 6 Å; contacts/interactions: 4 Å. d Structure of the terminal LRR motif of N. benthamiana (Nb)RXEG1, A. thaliana (At)RLP23, AtPSY1R, AtPSKR1, AtPSKR2, AtBRI1 and AtEFR.

https://doi.org/10.1038/s41467-023-44408-3
Nature Communications | (2024) 15:308

Fig. 6 |
Fig. 6 | On the origin of LRR-RLPs and LRR-RLK-Xbs. a Sequence similarity tree of IDs extracted from all LRR-containing PRRs. Branches are labelled in colours as indicated. The grey clade contains both the PSKR/PSY1R and BRI/BRL families and is shown in b. In b, characterized LRR-RLK-Xb and LRR-RLP members are labelled. The BRI/BRL clade, the PSKR clade, the PSY1R clade, and a clade with monocot-only IDs are labelled in different colours. In a, b, the inner ring (Inner), middle ring (Middle) and outer ring (Outer) are labelled as indicated. The outer ring represents the LRR-RLP and LRR-RLK classification shown in Fig. 4d. In the outermost three rings, the presence of Kx5Y (Kx5Y), Yx8K (Yx8K), or either Kx5Y or Yx8K (Either) in the IDs is indicated in green, and the absence of these motifs is indicated in grey. c Percentage (%) of IDs (before the last four LRR motifs) from LRR-RLPs and LRR-RLK-Xbs with the Kx5Y, Yx8KG, or either Kx5Y or Yx8KG (*) motifs. A two-sided Fisher test was performed to compare the number/fraction of IDs with either Kx5Y or Yx8KG in LRR-RLPs against LRR-RLK-Xb. The calculated p-value (<1e-16) is too low to be stated exactly; thus, an upper bound is given instead. d, g Structures and interaction interfaces of AtBRI1-Brassinosteroid (BR) and AtPSKR1-Phytosulfokine (PSK). Published structures of d AtBRL1-BR 100 and g AtPSKR1-PSK 48 are shown. The left panels show the interaction sites between the LRR-RLK-Xb receptors and their ligands. Contacts are indicated by grey lines, hydrogen bonds are indicated by green dotted lines, and salt bridges are shown as cyan dotted lines. The positions of LRR residues (counting from C to N) are shown. Amino acid residues that are important for the interactions are highlighted in yellow and green, respectively. The right panel represents the 2D interaction network between the LRR-RLK-Xb receptors and their ligands. Contacts/interactions are shown in grey lines, hydrogen bonds are shown in green lines, and salt bridges
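The motif comparison in panel c can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the regular expressions assume that "x5" and "x8" mean any five or eight residues, the function names are hypothetical, and the two-sided Fisher test is written out from the hypergeometric distribution so that the block needs only the standard library.

```python
import re
from math import comb

# Assumed reading of the motif shorthand: "x5" = any five residues.
KX5Y = re.compile(r"K.{5}Y")
YX8KG = re.compile(r"Y.{8}KG")


def has_either_motif(seq: str) -> bool:
    """Return True if the sequence contains Kx5Y or Yx8KG."""
    seq = seq.upper()
    return bool(KX5Y.search(seq) or YX8KG.search(seq))


def fisher_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher exact test for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of every table with the same
    margins that is no more probable than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x: int) -> float:
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Small relative tolerance guards against floating-point ties.
    p = sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-7))
    return min(1.0, p)


def motif_table(group_a: list[str], group_b: list[str]) -> tuple[int, int, int, int]:
    """Counts (with, without) the motif for each group, as a flat 2x2 table."""
    a_with = sum(has_either_motif(s) for s in group_a)
    b_with = sum(has_either_motif(s) for s in group_b)
    return a_with, len(group_a) - a_with, b_with, len(group_b) - b_with
```

With real island-domain sequences, `fisher_two_sided(*motif_table(rlp_ids, xb_ids))` would return the two-sided p-value; `rlp_ids` and `xb_ids` are placeholder names for lists of sequences.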

Fig. 7 | Alignment and features of the terminal four C3, eJM, and TM region in LRR-RLKs and LRR-RLPs. a Alignment of the C3, eJM, and TM region in LRR-RLKs (from subgroups II, XII, XI, Xa, Xb) and LRR-RLPs. The BRI1/BRL clade is highlighted in purple; the PSKR/PSY1R clade is highlighted in cyan, and the LRR-RLP with the ID + 4LRR clade (see Fig. 4c) is highlighted in light green. Amino acid residues involved in brassinosteroid (BR) binding for BRI/BRL and residues required for SERK interaction for BRI1/BRL, PSKR1, and RXEG1 (LRR-RLP) are highlighted. The colour code for each highlight is indicated in the box on top. For the eJM region, amino acid residues with negative charges are highlighted in red, and amino acids with positive charges are highlighted in blue. The GxxxG motif in the TM region is highlighted in yellow. The interaction sites were calculated using iCn3D 98 with the following thresholds: hydrogen bonds: 4.2 Å; salt bridges/ionic bonds: 6 Å; contacts/interactions: 4 Å. For details, please refer to Fig. 5. b Percentages of LRR-RLKs or LRR-RLPs with the stated amino acid residues in the corresponding position in a.
Percentages (%) were calculated as the number of LRR-RLKs or LRR-RLPs in the subgroup with the stated residue divided by the number of LRR-RLKs or LRR-RLPs in the subgroup without the stated residue × 100. c Sequence similarity tree of the eJM-TM-cJM region from all LRR-containing cell-surface receptors of 350 species. Branches are colour-labelled as indicated. The inner ring and middle ring indicate the lineage and subclass/order of the corresponding protein (species) from the branch. The outer ring represents the LRR-RLP or LRR-RLK classification, which is indicated in Fig. 4d. Characterized LRR-RLK-Xb and LRR-RLP members are labelled. The BRI/BRL clade and the PSKR/PSY1R clades are also labelled. d Overall charge distribution in eJM (left), percentage of receptors with GxxxG (middle), and GxxxGxxxG (right) in the TM region. For b, d, RLP/RLK-Xb (outside clade) refers to receptors outside the light grey clade in Fig. 4f. RLP/RLK-Xb (within clade) refers to receptors inside the light grey clade in Fig. 4f. Number of cell-surface receptors (n) in each LRR-RLK subgroup: I, n = 752; II, n = 682; III, n = 6572; IV, n = 1033; V, n = 8; VI-1, n = 84; VI-2, n = 146; VII, n = 1720; VIII-1, n = 195; VIII-2, n = 411; IX, n = 70; Xa, n = 96; Xb, n = 3182; XI, n = 8807; XII, n = 12863; XIIIa, n = 739; XIIIb, n = 465; XIV, n = 241; XV, n = 548; Xb (outside clade), n = 580; Xb (within clade), n = 2527; BRI/BRL clade, n = 1170; PSKR/PSY1R clade, n = 1347. Number of cell-surface receptors (n) in each LRR-RLP subgroup: LRR-RLP (RLP), n = 24970; RLP (outside clade), n = 5000; RLP (within clade), n = 19970. Boxplot elements: centre line, median; bounds of box, 25th and 75th percentiles; whiskers, 1.5 × IQR from 25th and 75th percentiles.
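The two features summarised in panel d, eJM charge and TM glycine motifs, can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions rather than the authors' method: net charge is approximated by counting Lys/Arg as +1 and Asp/Glu as -1 (histidine is ignored), GxxxG is read as G followed by any three residues and another G, and the function names are hypothetical.

```python
import re

NEGATIVE = set("DE")  # Asp, Glu
POSITIVE = set("KR")  # Lys, Arg (His is ignored here: an assumption)

# Assumed reading of the shorthand: "xxx" = any three residues.
GXXXG = re.compile(r"G.{3}G")
GXXXGXXXG = re.compile(r"G.{3}G.{3}G")


def net_charge(ejm: str) -> int:
    """Approximate net charge of an eJM sequence from residue counts."""
    return sum((aa in POSITIVE) - (aa in NEGATIVE) for aa in ejm.upper())


def tm_motifs(tm: str) -> dict[str, bool]:
    """Report whether a TM sequence contains GxxxG and GxxxGxxxG."""
    tm = tm.upper()
    return {
        "GxxxG": bool(GXXXG.search(tm)),
        "GxxxGxxxG": bool(GXXXGXXXG.search(tm)),
    }


def percent_with(seqs: list[str], motif: str) -> float:
    """Percentage of TM sequences in a group that carry the named motif."""
    if not seqs:
        return 0.0
    return 100.0 * sum(tm_motifs(s)[motif] for s in seqs) / len(seqs)
```

Note that `percent_with` divides by the group total, which is the conventional definition of a percentage and is a choice made for this sketch.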
indicated time points. The tissues were then lysed in liquid nitrogen and extracted in 1×NuPAGE™ LDS Sample Buffer (Invitrogen™) with 10 mM DTT at 70 °C for 10 min. Total proteins were then separated by SDS-PAGE and blotted onto a nitrocellulose membrane (Trans-Blot Turbo Transfer System, Bio-Rad). The membrane was then blocked in a solution of either 5% skimmed milk (for BES1 and cell-surface receptor detection) or 5% bovine serum albumin (BSA; for MAPK detection) in Tris-buffered saline, 0.1% Tween 20 detergent (TBST) for an hour. Phosphorylated MAPKs were detected using α-phospho-p44/42 MAPK rabbit monoclonal antibody (D13.14.4E, in 1:2000, Cell Signalling

Fig. 9 | The evolutionary trajectory of PTI in plants. a The top panel depicts the presence of cell-surface receptors, co-receptors, cytoplasmic kinases (cytoplasmic Ks), and downstream signalling components (downstream) in Glaucophyta and Rhodophyta, green algae, Bryophytes, and Tracheophytes. Nodes are labelled in colours as indicated on the left. The absence of a node indicates the absence of a gene family from the lineage. Nodes with dotted outlines indicate the presence of a gene family, but the absence of immunity-related orthologs. Nodes with thick outlines indicate the expansion of gene families. Repeated expansion is indicated with thicker outlines. The middle panel (top) displays the percentages (%) of cell-surface receptors and signalling components in the genome of each species within the lineage. Bars are labelled in colours as indicated. The middle panel (bottom) represents the distribution of signalling components, including co-receptors, cytoplasmic kinases, and downstream signalling components, in the genome of each species within a lineage. Bars are labelled in colours as indicated on the top left. The bottom panel shows examples of plant species and the classification of the plant lineages. Gla Glaucophyta, MCK Mesostigmatophyceae, Chlorokybophyceae, and Klebsormidiophyceae, C Charophyta, Zyg Zygnematophyceae, Marcha Marchantiophyta, Mosses Bryophyte, Anthocero Anthocerotophyta, F Lycophytes and Polypodiophyta, G Gymnosperms. b Evolution of PTI in plants. Left panel: expansion of PRR family gene repertoires throughout the plant lineage, which leads to recognition of a larger range of PAMPs/MAMPs. Middle panel: PRR co-receptors, EP proteins, and helper NLRs are absent from many algal species. A more complex immune network involving these signalling components apparently developed in vascular plants. Right panel: an ancient PRR with LRR ID + 4LRR of unknown function evolved into LRR-RLK-Xbs and LRR-RLPs, which are involved in development- and immune-signalling, respectively. The eJM-TM-cJM region of LRR-RLPs evolved to allow interactions with SOBIR1 to induce immunity (negatively charged eJM and GxxxG). LRR-RLK-Xbs utilize Xb kinase domains to induce distinct downstream responses.

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.